Approximate Dynamic Programming: a $\mathcal{Q}$-Function Approach
Authors
Abstract
In this paper we study both the value function and Q-function formulations of the linear programming (LP) approach to approximate dynamic programming (ADP). The approach selects, from a restricted function space, an approximation that fits the true optimal value function and Q-function. Working in the discrete-time, continuous-space setting, we extend and prove guarantees for the fitting error and the online performance of the policy, providing tighter bounds. We also provide a condition that allows the Q-function approach to be formulated more efficiently in many practical cases.
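For orientation, the two LP formulations named in the abstract can be sketched in their standard cost-minimisation form. This is an illustrative sketch, not the paper's exact statement: the notation is assumed here rather than taken from the abstract (state $x$, input $u$, disturbance $\xi$, dynamics $g$, stage cost $l$, discount factor $\gamma$, weightings $c$ and $\tilde{c}$, basis functions $\phi_i$ and $\psi_j$).

```latex
% Assumed notation (none of it appears in the abstract):
%   x^+ = g(x,u,\xi)   dynamics          l(x,u)   stage cost
%   \gamma \in (0,1)   discount factor   c, \tilde{c}   relevance weightings
%   \phi_i, \psi_j     basis functions spanning the restricted function space
\begin{align*}
\text{Value function:}\quad
  & \max_{\alpha}\ \int \hat{V}(x)\, c(\mathrm{d}x) \\
  & \text{s.t.}\ \ \hat{V}(x) \le l(x,u)
      + \gamma\, \mathbb{E}_{\xi}\!\left[\hat{V}\big(g(x,u,\xi)\big)\right]
      \quad \forall (x,u), \\
  & \phantom{\text{s.t.}\ \ } \hat{V}(x) = \sum_i \alpha_i\, \phi_i(x), \\[6pt]
\text{Q-function:}\quad
  & \max_{\beta}\ \int \hat{Q}(x,u)\, \tilde{c}\big(\mathrm{d}(x,u)\big) \\
  & \text{s.t.}\ \ \hat{Q}(x,u) \le l(x,u)
      + \gamma\, \mathbb{E}_{\xi}\!\left[\min_{v} \hat{Q}\big(g(x,u,\xi), v\big)\right]
      \quad \forall (x,u), \\
  & \phantom{\text{s.t.}\ \ } \hat{Q}(x,u) = \sum_j \beta_j\, \psi_j(x,u).
\end{align*}
```

As written, the inner minimisation makes the Q-function constraint nonlinear in $\beta$. One common way to recover a linear program, assumed here for illustration, is to introduce an auxiliary function $\hat{V}$ with $\hat{V}(x) \le \hat{Q}(x,u)$ for all $(x,u)$ and to replace $\min_v \hat{Q}$ by $\hat{V}$ in the constraint. A policy can then be read off greedily as $\pi(x) \in \arg\min_u \hat{Q}(x,u)$.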
Similar resources
Approximate String Matching with Ordered q-Grams
Approximate string matching with k differences is considered. Filtration of the text is a widely adopted technique to reduce the text area processed by dynamic programming. We present sublinear filtration algorithms based on the locations of q-grams in the pattern. Samples of q-grams are drawn from the text at fixed periods, and only if consecutive samples appear in the pattern approximately in...
QR-Tuning and Approximate-LS Solutions of the HJB Equation for Online DLQR Design via State and Action-Dependent Heuristic Dynamic Programming
This paper is concerned with a novel approach for the online design of optimal control systems based on QR-tuning, state and action-dependent heuristic dynamic programming, and approximate-LS solutions of the Hamilton-Jacobi-Bellman (HJB) equation. QR-tuning for optimal control systems takes into account heuristic variations in the weighting matrices Q and R of the discrete linear quadratic...
Modal Intuitionistic Logics and Predicate Superintuitionistic Logics: Corresp...
In this note we deal with intuitionistic modal logics over $\mathbf{MIPC}$ and predicate superintuitionistic logics. We study the correspondence between the lattice of all (normal) extensions of $\mathbf{MIPC}$ and the lattice of all predicate superintuitionistic logics. Let $\mathrm{L}_{Prop}$ denote a propositional language which contains two modal operators $\square$ and $\mathrm{O}$, and ...
An Approximate Dynamic Programming Approach to Decentralized Control of Stochastic Systems
In this paper we consider the problem of computing decentralized control policies for stochastic systems with finite state and action spaces. Synthesis of optimal decentralized policies for such problems is known to be NP-hard [15]. Here we focus on methods for efficiently computing meaningful suboptimal decentralized control policies. The algorithms we present here are based on approximation o...
Heuristic Dynamic Programming Nonlinear Optimal Controller
This chapter is concerned with the application of approximate dynamic programming (ADP) techniques to solve for the value function, and hence the optimal control policy, in discrete-time nonlinear optimal control problems with continuous state and action spaces. ADP is a reinforcement learning approach (Sutton & Barto, 1998) based on adaptive critics (Barto et al., 1983), (Widrow et al., 1973...
Journal title:
- CoRR
Volume abs/1602.07273
Pages -
Publication date 2016